Abstract

The  use  of  a  primate's  spatial  ability  of  mental  rotation  to  serve  as  a  basis  for  robotic  navigation  has  been 
almost  entirely  overlooked  by  the  robotics  community  to  date.  In  this  paper,  the  role  of  this  cognitive 
capacity  is  presented  as  an  adjunct  to  existing  robotic  control  systems,  with  the  underlying  approach  being 
derived  from  studies  of  primate  spatial  cognition.  Specifically,  optical  flow  is  used  as  a  basis  for  transitory 
representations  (snapshots)  that  are  compared  to  an  a  priori  visual  goal  to  provide  corrective  course  action 
for  a  robot  when  moving  through  the  world.  The  underlying  architecture  and  procedures  are  described. 


I.  Introduction 

The capacity for performing mental rotations is believed to have been observed in multiple species, not
solely humans, and thus raises the question of whether it might have value in the context of robotics.
The  question  persists  regarding  just  what  evolutionary  advantage  is  afforded  by  this  spatial  reasoning 
system, especially for non-human primates, who do not read maps. In an ongoing three-year project
for the Office of Naval Research, we are investigating the role of mental rotation in navigation for
autonomous robots, not as a replacement for existing methods but rather as an adjunct for certain situations.
Using optic-flow-derived depth maps as quasi-instantaneous snapshots of the environment, a behavioral
robotic  architecture  is  enhanced  with  this  semi-reactive  navigational  capability  based  upon  primate  models 
of  optic  flow  and  mental  rotation.  To  support  this  work,  a  vectorial  mathematical  framework  is  developed, 
spanning  the  sensorimotor  and  cognitive  spatial  aspects  of  this  approach.  This  paper  provides  the 
motivation,  background,  theoretical  basis,  computational  models,  and  results  to  date  for  utilizing  primate- 
inspired  mental  rotation  models  as  a  basis  for  intelligent  robotic  navigation. 

While  considerable  scientific  study  has  been  devoted  to  mental  rotation  from  both  a  psychological  and 
neuroscientific  perspective  in  humans  (e.g.,  [Shepard  and  Cooper  82,  Yule  06,  Georgopoulos  and  Pellizzer 
95,  Khooshabeh  00]),  primates  (e.g.,  [Vauclair  et  al  93,  Hopkins  et  al  93,  Kohler  et  al  05]),  and  other 
mammals  (e.g.,  [Mauck  and  Dehnhardt  97]),  little  is  actually  known  about  the  underlying  representations 
and  processes  by  which  it  occurs.  There  are  two  primary  schools  of  thought  regarding  representation:  one 
involving  propositional  assertions  (e.g.,  [Pylyshyn  73,  Anderson  78])  and  the  other  using  analog  visual 
reasoning  (e.g.,  [Khooshabeh  and  Hegarty  10]).  Our  current  research  (Fig.  1)  focuses  on  visual 
representations  directly  derived  from  imagery,  exploring  a  multiplicity  of  representational  options, 
including  basis  optic  flow  fields  [Roberts  et  al  09],  standard  optic  flow  (Fig.  2b)  [Gibson  79,  Farneback 
03],  depth  maps  acquired  either  directly  from  a  Kinect  sensor  (Fig.  2c)  or  generated  from  the  optic  flow 
field,  and  spatial  occupancy  grids  (Fig.  2d)  (e.g.,  [Elfes  89])  where  each  of  these  different  representations  is 
mapped onto the motor control system underlying the navigational control of a mobile robot [Arkin 89].

Figure 1: Multiple Representational Pathways to Motor Control

Figure 2: (A) Last image of optical flow sequence. (B) Optical flow generated from sequence. (C) Kinect depth map. (D) Occupancy grid generated from depth map.
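
As a minimal illustrative sketch of how a depth map might be cast as a spatial occupancy grid, the following projects one horizontal row of depth readings onto a 2D grid in front of the robot. It is not the implementation used in this work: the function name, the pinhole-camera model, the 57-degree field of view, the cell size, and the grid dimensions are all assumptions chosen for illustration.

```python
import math

def depth_row_to_occupancy(depths, hfov_deg=57.0, cell_m=0.25,
                           grid_w=16, grid_d=16, max_range_m=4.0):
    """Project one horizontal row of depth readings (metres) onto a 2D grid.

    Assumed layout: the robot sits at the bottom-centre of the grid,
    x is lateral and z is forward. Returns grid[z_index][x_index],
    where 1 marks an occupied cell and 0 marks unknown or free space.
    """
    grid = [[0] * grid_w for _ in range(grid_d)]
    n = len(depths)
    half_fov = math.radians(hfov_deg) / 2.0
    for i, z in enumerate(depths):
        if z is None or z <= 0.0 or z > max_range_m:
            continue  # no valid return for this pixel column
        # Bearing of pixel column i, from -half_fov (left) to +half_fov (right).
        bearing = -half_fov + (i + 0.5) * (2.0 * half_fov / n)
        x = z * math.tan(bearing)  # lateral offset under a pinhole model
        col = int(math.floor(x / cell_m)) + grid_w // 2
        row = int(math.floor(z / cell_m))
        if 0 <= col < grid_w and 0 <= row < grid_d:
            grid[row][col] = 1
    return grid

# Hypothetical usage: a flat wall 2 m ahead fills part of one grid row.
wall = depth_row_to_occupancy([2.0] * 8)
```

A full depth image would repeat this per row (or per height band), and a probabilistic grid in the sense of [Elfes 89] would accumulate evidence per cell rather than set a binary flag.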


II.  A  Role  for  Mental  Rotation  in  Robot  Navigation 


While mental rotation and its role in primate and human navigation have been studied to a degree by the
spatial  cognition  and  neuroscientific  communities  (e.g.,  [Shepard  and  Hurwitz  84,  Wexler  et  al  94, 
Kozhevnikov  et  al  06]),  relatively  few  studies  have  considered  its  role  in  robot  navigation  (e.g.,  [Taylor  et 
al 08]). There already exist many excellent approaches for robotic navigational control that have been developed
over the past two decades (cf. [Arkin 98, Thrun et al 05]), so the question that confronts us is what, if
anything,  can  adding  the  cognitive  ability  of  mental  rotation  to  a  robot  provide  above  and  beyond  these 
already  existing  capabilities.  [Norman  and  Shallice  86]  have  shown  that  under  certain  circumstances  willed 
cognitive  mechanisms  beyond  automatic  control  are  important,  especially  in  situations  involving 
troubleshooting,  dangerous  or  difficult  actions,  or  in  novel  or  poorly  learned  situations.  This  can  provide 
insight  into  opportunities  to  enhance  already  competent  navigational  methods. 

The  goal  of  our  research  is  not  necessarily  to  supplant  existing  robotic  control  strategies,  but  rather 
augment  them  with  spatial  capabilities  derived  from  mental  rotation  research  that  facilitate  proper  action 
under  these  types  of  conditions.  This  requires  two  components:  (1)  recognition  that  such  a  situation 
warrants  its  use,  i.e.,  the  inadequacy  of  existing  control  methods  is  recognized  through  cognizant  failure 
[Gat  and  Dorais  94];  and  (2)  the  application  of  visual  representations  derived  from  optic  flow  that  are  used 
in a process analogous to mental rotation, helping to provide the necessary impetus for correct robot
action  at  the  right  time.  To  accomplish  the  latter,  a  visual  snapshot  of  a  depth  map  or  occupancy  grid  of  a 
position  near  the  spatial  location  of  the  robot’s  goal  is  stored  a  priori.  A  series  of  these  serve  as  waypoints 
for  the  overall  planned  path  which  the  robot  must  satisfy.  Based  on  small  incremental  translational  and 
rotational  motions,  local  depth  maps  are  created  at  the  robot’s  present  position  while  it  is  moving.  These  are 
local  snapshots  of  the  3D  world  structure  that  the  robot  currently  perceives  (Fig.  3),  cast  in  the  same 
representational  framework  as  the  stored  visual  goal  state. 

Various  correlation  methods  are  being  explored  (e.g.,  [Hirshmuller  01,  Lewis  95,  Elfes  89])  to  determine 
which  might  provide  the  best  real-time  correlation  between  local  and  goal  depth  maps  (or  even  the  direct 
optic  flow  representations)  to  provide  intermittent  yet  frequent  motor  control  updates  for  the  robot. 
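
To make the comparison step concrete, the following is a minimal sketch of one possible correlation scheme, not the method selected in this work. It reduces each depth map to a single row of depths, slides the current snapshot horizontally against the stored goal, scores each offset by normalized cross-correlation, and converts the best offset into a turn angle. The function names, the 60-degree field of view, and the sign convention (a positive shift is read as "turn right") are assumptions made for illustration.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences, in [-1, 1]."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat profile: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom

def best_shift(snapshot, goal, max_shift):
    """Return (shift, score) such that snapshot[i + shift] best matches goal[i]."""
    best = (0, -2.0)
    n = len(goal)
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(n, n - s)  # columns where both rows overlap
        if hi - lo < 2:
            continue
        score = ncc(snapshot[lo + s:hi + s], goal[lo:hi])
        if score > best[1]:
            best = (s, score)
    return best

def heading_correction(snapshot, goal, hfov_deg=60.0, max_shift=3):
    """Turn angle in radians (assumed positive = right) and the match score."""
    shift, score = best_shift(snapshot, goal, max_shift)
    rad_per_col = math.radians(hfov_deg) / len(goal)
    return shift * rad_per_col, score

# Hypothetical usage: the goal view appears two columns to the right.
goal = [3, 3, 1, 1, 3, 3, 2, 2, 3, 3]
snapshot = [3, 3] + goal[:8]
turn, score = heading_correction(snapshot, goal)
```

The match score could also serve as a confidence measure: a low peak correlation suggests the snapshot and goal views are too dissimilar for the correction to be trusted.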

Figure 3: Flow of control from optical flow through spatial mental rotation to motor behavior

Figure 4: Correlation process

[Figure labels: (Top) Snapshot; (Bottom) Goal State; Motor Control]


Another  goal  of  this  research  is  to  find  a  spanning  mathematical  framework  (interlingua)  that  can  unite 
sensory processing, cognitive spatial manipulation, and motor control. Optical flow [Gibson 79, Farneback
03], a perceptual process for the derivation of structure from motion, has consistently used vectorial
representations  to  represent  the  flow  between  monocular  images,  computing  depth  by  correspondence  and 
knowledge  of  the  egomotion  of  the  agent.  Vectorial  representations  are  omnipresent  in  motor  control  as 
well,  both  in  biological  systems  (e.g.,  [Georgopoulos  86,  Georgopoulos  et  al  86,  Bizzi  et  al  91])  and  robotic 
systems  (e.g.,  [Arkin  89,  Giszter  et  al  00]).  While  hybrid  deliberative/reactive  architectures  provide  a 
means  for  integrating  higher  level  cognitive  planning  with  purely  reactive  control  [Arkin  98],  a  means  for 
integrating mental rotation into the navigational process has to date been neglected in robotics. Casting
mental rotation as a motor process [Wexler et al 94] (Fig. 3) may yet yield the missing link connecting cognition
to the sensorimotor framework for robotics.
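
As a worked illustration of computing depth from flow vectors and known egomotion, consider the simplest case: a pinhole camera translating straight ahead through a static scene. A point at depth Z seen at image offset (x, y) from the focus of expansion then has flow (x, y) * T / Z, where T is the forward speed, so Z = T * r / |flow| with r the distance from the focus of expansion. The sketch below encodes only this special case; the function name and parameters are hypothetical, and rotation, lateral motion, and moving objects are deliberately ignored.

```python
import math

def depth_from_radial_flow(x, y, u, v, forward_speed, min_flow=1e-6):
    """Depth (m) of a static point under pure forward camera translation.

    (x, y): pixel offset from the focus of expansion.
    (u, v): optic flow at that pixel, in pixels per second.
    forward_speed: known egomotion along the optical axis, in m/s.
    Returns None where depth cannot be recovered.
    """
    r = math.hypot(x, y)
    speed = math.hypot(u, v)
    if r == 0.0 or speed < min_flow:
        return None  # at the focus of expansion, or no measurable flow
    return forward_speed * r / speed

# Hypothetical usage: robot moving at 0.5 m/s, flow vector (10, 7.5) px/s
# measured 50 px from the focus of expansion.
z = depth_from_radial_flow(40.0, 30.0, 10.0, 7.5, forward_speed=0.5)
```

Because both the input (flow) and the relation itself are expressed per vector, the resulting depth estimates stay aligned with the flow field, which is what makes a shared vectorial representation attractive.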


III.  Summary 

This  research  directly  applies  theories  of  mental  rotation  from  the  spatial  cognition  community  to  robot 
navigation  with  the  goal  of  augmenting  existing  robotic  navigational  systems  to  cope  with  situations  that 
might  otherwise  be  confounding.  Using  optical  flow  as  the  basis  for  internal  representation,  a  control 
regime  is  described  which  can  potentially  generate  effective  navigational  solutions  when  confronted  with 
novel, dangerous, confusing, or poorly learned situations. This short paper outlines the architectural
framework by which such augmentation can be provided based upon the correlation of visual depth maps
representing transient positions and a desired goal state. Future work will report on the efficacy of these
various representations and correlation methods as the basis by which such navigational augmentation
can be provided.